Model Formulation: Automated Diagnosis of Data-Model Conflicts Using Metadata
Abstract
The authors describe a methodology for helping computational biologists diagnose discrepancies they encounter between experimental data and the predictions of scientific models. The authors call these discrepancies data-model conflicts. They have built a prototype system to help scientists resolve these conflicts in a more systematic, evidence-based manner. In computational biology, data-model conflicts are the result of complex computations in which data and models are transformed and evaluated. Increasingly, the data, models, and tools employed in these computations come from diverse and distributed resources, contributing to a widening gap between the scientist and the original context in which these resources were produced. This contextual rift can contribute to the misuse of scientific data or tools and amplifies the problem of diagnosing data-model conflicts. The authors' hypothesis is that systematic collection of metadata about a computational process can help bridge the contextual rift and provide information for supporting automated diagnosis of these conflicts. The methodology involves three major steps. First, the authors decompose the data-model evaluation process into abstract functional components. Next, they use this process decomposition to enumerate the possible causes of the data-model conflict and direct the acquisition of diagnostically relevant metadata. Finally, they use evidence statically and dynamically generated from the metadata collected to identify the most likely causes of the given conflict. They describe how these methods are implemented in a knowledge-based system called GRENDEL and show how GRENDEL can be used to help diagnose conflicts between experimental data and computationally built structural models of the 30S ribosomal subunit.
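The three steps described above can be illustrated with a minimal Python sketch. This is not GRENDEL's implementation, which the abstract does not detail. The component names, the candidate causes, the metadata keys, and the scoring rule (fraction of a cause's supporting metadata that is actually observed) are all hypothetical, chosen only to show how a process decomposition can drive both cause enumeration and metadata-based ranking.

```python
from dataclasses import dataclass, field


@dataclass
class Cause:
    """A candidate explanation for a conflict, tied to one process component."""
    component: str
    description: str
    # Hypothetical rule: metadata key -> value that would support this cause.
    supporting_metadata: dict = field(default_factory=dict)


def rank_causes(causes, metadata):
    """Score each cause by the fraction of its supporting metadata that is observed."""
    ranked = []
    for cause in causes:
        expected = cause.supporting_metadata
        if not expected:
            score = 0.0
        else:
            hits = sum(1 for key, value in expected.items()
                       if metadata.get(key) == value)
            score = hits / len(expected)
        ranked.append((score, cause))
    # Highest score first; the sort is stable, so ties keep enumeration order.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


# Step 1: hypothetical decomposition of a data-model evaluation into components.
COMPONENTS = ["data_acquisition", "data_transformation",
              "model_construction", "comparison"]

# Step 2: enumerate candidate causes per component (illustrative entries only).
CAUSES = [
    Cause("data_acquisition",
          "data collected under different conditions than the model assumes",
          {"conditions_match": False}),
    Cause("data_transformation",
          "unit mismatch introduced during conversion",
          {"units_checked": False, "conversion_applied": True}),
    Cause("model_construction",
          "model built from an outdated input set",
          {"inputs_current": False}),
    Cause("comparison",
          "tolerance threshold too strict",
          {"threshold_reviewed": False}),
]

# Step 3: metadata collected about one (made-up) computational run.
metadata = {
    "conditions_match": True,
    "units_checked": False,
    "conversion_applied": True,
    "inputs_current": True,
    "threshold_reviewed": True,
}

ranking = rank_causes(CAUSES, metadata)
for score, cause in ranking:
    print(f"{score:.2f}  {cause.component}: {cause.description}")
```

In this made-up run, only the transformation cause has all of its supporting metadata observed, so it is listed first. A real system would presumably weigh evidence in a richer way than this simple fraction.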
Similar resources
Metadata Enrichment for Automatic Data Entry Based on Relational Data Models
Automatically generating data entry forms from relational data models is a well-known idea that has drawn growing attention as agile software development methods and programming tools have become more popular. One of the requirements of the automation methods, whether in commercial products or the relevant research projects, ...
Identifying Bibliographic Relationships in the Catalog of the National Library of Iran Based on the Functional Requirements for Bibliographic Records (FRBR) Model: A First Step in Representing the Knowledge Network of Iranian-Islamic Publications
The aim of this study is to find out the bibliographic relationships between the metadata records in the National Library and Archives of Iran (NLAI) according to the FRBR model, in order to represent the knowledge network of Iranian-Islamic publications. To achieve this objective, the content analysis method was used. The study population includes metadata records for books in NLAI for four biblio...
Predicting Liver Diseases Using a Hidden Markov Model
Background: The liver is the largest internal organ and, after the heart and brain, the most important organ in the human body; life is impossible without it. Diagnosing liver disease takes a long time and requires sufficient expertise on the part of the doctor. Statistical methods can serve as an automated forecasting system and help specialists diagnose liver disease quickly and accurately. Hidden Ma...
A Conceptual Model for Underlying Factors of Parent-Adolescent Conflicts from Parents’ Perspective
Parent-adolescent conflict, which is affected by many factors, is one of the most important problems in many families with adolescents. This study, which was conducted via a qualitative method on the basis of grounded theory, aimed at identifying the underlying factors of parent-adolescent conflicts. Using theoretical, purposive, and voluntary sampling, a total number of 14 couples were selecte...
A New Formulation for Cost-Sensitive Two Group Support Vector Machine with Multiple Error Rate
Support vector machine (SVM) is a popular classification technique that classifies data using a max-margin separating hyperplane. The normal vector and bias of this hyperplane are determined by solving a quadratic model, which means that SVM training involves solving an optimization problem. Among the extensions of SVM, the cost-sensitive scheme refers to a model with multiple costs which conside...
Journal: Journal of the American Medical Informatics Association : JAMIA
Volume 6, Issue 5
Pages: -
Publication date: 1999